Abstract

We investigate the properties of Inclusion Logic, that is, First Order Logic with Team Semantics extended with inclusion dependencies. We prove that Inclusion Logic is equivalent to Greatest Fixed Point Logic, and we prove that all union-closed first-order definable properties of relations are definable in it. We also provide an Ehrenfeucht-Fraïssé game for Inclusion Logic, and give an example illustrating its use.

1 Introduction

Inclusion Logic [10], FO($\subseteq$), is a novel logical formalism designed for expressing inclusion dependencies between variables. It is closely related to Dependence Logic [23], FO(D), which is the extension of First Order Logic by functional dependencies between variables. Dependence Logic initially arose as a variant of Branching Quantifier Logic [13] and of Independence-Friendly Logic [14, 22], and its study has sparked the development of a whole family of logics obtained by adding various dependency conditions to First Order Logic.

All these logics are based on Team Semantics [HI |24], which is a generalization of Tarski Semantics. In Team Semantics, formulas are satisfied or not satisfied by sets of assignments, called teams, rather than by single assignments. This semantics was introduced in [16] for the purpose of defining a compositional equivalent for the Game Theoretic Semantics of Independence-Friendly Logic [14, 22], but it was soon found to be of independent interest. See [§] for a mostly up-to-date account of the research on Team Semantics.

Like Branching Quantifier Logic and Independence-Friendly Logic, Dependence Logic has the same expressive power as Existential Second Order Logic $\Sigma_1^1$: every FO(D)-sentence is equivalent to some $\Sigma_1^1$-sentence, and vice versa [23]. The semantics of Dependence Logic is downwards closed in the sense that if a team $X$ satisfies a formula $\phi$ in a model $M$, then all subteams $Y \subseteq X$ also satisfy $\phi$ in $M$. The equivalence between FO(D) and $\Sigma_1^1$ was extended to formulas in [19], where it was proved that FO(D) captures exactly the downwards closed $\Sigma_1^1$-definable properties of teams.

Other variants of Dependence Logic that have been studied are Conditional Independence Logic FO($\perp_c$) [12], Independence Logic FO($\perp$) [12, 25], Exclusion Logic FO($\mid$) [10] and Inclusion/Exclusion Logic FO($\subseteq, \mid$) [10]. All the logics in this family arise from dependency notions that have been studied in Database Theory. In particular, FO(D) is based on functional dependencies introduced by Armstrong [1], FO($\subseteq$) is based on inclusion dependencies [HI E], FO($\mid$) is based on exclusion dependencies [3], and FO($\perp$) is based on independence conditions [11].

The expressive power of all these logics, with the exception of FO($\subseteq$), is well understood. It is known that, with respect to sentences, they are all equivalent to $\Sigma_1^1$. With respect to formulas, FO($\mid$) is equivalent to FO(D) [10]; and FO($\subseteq, \mid$), FO($\perp_c$) and FO($\perp$) are all equivalent to each other [10, 25]. Moreover, FO($\perp_c$) (and hence also FO($\subseteq, \mid$) and FO($\perp$)) captures all $\Sigma_1^1$-definable properties of teams [10].

On the other hand, relatively little is known about the expressive power of Inclusion Logic, and the main purpose of the present work is precisely to remedy this. What little is known about this formalism can be found in [10], and amounts to the following: With respect to formulas, FO($\subseteq$) is strictly weaker than $\Sigma_1^1$ = FO($\perp_c$) and incomparable with FO(D) = FO($\mid$). This is simply because the semantics of FO($\subseteq$) is not downwards closed, but is closed under unions: if both teams $X$ and $Y$ satisfy a formula $\phi$ in a model $M$, then $X \cup Y$ also satisfies $\phi$ in $M$. Moreover, it is known that FO($\subseteq$) is stronger than First Order Logic over sentences, and that it is contained in $\Sigma_1^1$; but it was an open problem whether it is equivalent to $\Sigma_1^1$, or whether FO($\subseteq$)-formulas could define all union-closed $\Sigma_1^1$-definable properties of teams.

In this paper we show that the answer to both of these problems is negative. In fact, we give a complete characterization for the expressive power of FO($\subseteq$) in terms of Positive Greatest Fixed Point Logic GFP$^+$: We prove that every FO($\subseteq$)-sentence is equivalent to some GFP$^+$-sentence, and vice versa (Corollary 17). Moreover, we prove that a property of teams is definable by an FO($\subseteq$)-formula if and only if it is expressible by a GFP$^+$-formula in a specific way (Theorems 15 and 16).

Fixed point logics have a central role in the area of Descriptive Complexity Theory. By the famous result of Immerman [17] and Vardi [26], Least Fixed Point Logic LFP captures PTIME on the class of ordered finite models. Furthermore, it is well known that on finite models, LFP is equivalent to GFP$^+$. Thus, we obtain a novel characterization for PTIME: a class of ordered finite models is in PTIME if and only if it is definable by a sentence of FO($\subseteq$).

In addition to the equivalence with GFP$^+$, we prove that all union-closed first-order definable properties of teams are definable in Inclusion Logic (Corollary 26). Thus, it is not possible to increase the expressive power of FO($\subseteq$) by adding first-order definable union-closed dependencies. On the other hand, it is an interesting open problem whether FO($\subseteq$) can be extended by some natural set $\mathcal{D}$ of union-closed dependencies such that the extension FO($\subseteq, \mathcal{D}$) captures all union-closed $\Sigma_1^1$-definable properties of teams.

We also introduce a new Ehrenfeucht-Fraïssé game that characterizes the expressive power of Inclusion Logic (Theorem |2"§|). Our game is a modification of the EF game for Dependence Logic defined in [23]. Although the EF game has a clear second order flavour, it is still more manageable than the usual EF game for $\Sigma_1^1$; we illustrate this by describing a concrete winning strategy for Duplicator in the case of models with empty signature (Proposition |3n]). Due to the equivalence between FO($\subseteq$) and GFP$^+$ we see that the EF game for Inclusion Logic is also a novel EF game for GFP$^+$; it is quite different in structure from the one introduced in [2]. It may be hoped that this new game and its variants could be of some use for studying the expressive power of fixed point logics.



2 Preliminaries 

2.1 Team Semantics 

In this section, we recall the definition of Team Semantics for First Order Logic. For simplicity, we assume that all our expressions are in negation normal form.

Definition 1 Let $M$ be a first order model and let $V$ be a set of variables. A team $X$ over $M$ with domain $\mathrm{Dom}(X) = V$ is a set of assignments $s : V \to \mathrm{Dom}(M)$. Given a tuple $\vec t = (t_1, \ldots, t_n)$ of terms with variables in $V$ and an assignment $s \in X$, we write $\vec t(s)$ for the tuple $(t_1(s), \ldots, t_n(s))$, where $t(s)$ denotes the value of the term $t$ with respect to $s$ in the model $M$. Furthermore, we write $X(\vec t)$ for the relation $\{\vec t(s) : s \in X\}$.

A (non-deterministic) choice function for a team $X$ over a set $A$ is a function $H : X \to \mathcal{P}(A) \setminus \{\emptyset\}$. The set of all choice functions for $X$ over $A$ is denoted by $C(X, A)$.

Definition 2 (Team Semantics for First Order Logic) Let $M$ be a first order model and let $X$ be a team over it. Then, for all first-order literals $\alpha$, variables $v$, and formulas $\phi$ and $\psi$ over the signature of $M$ and with free variables in $\mathrm{Dom}(X)$:

TS-lit: $M \models_X \alpha$ iff for all $s \in X$, $M \models_s \alpha$ in the usual Tarski Semantics sense;

TS-$\lor$: $M \models_X \phi \lor \psi$ iff $X = Y \cup Z$ for some $Y$ and $Z$ such that $M \models_Y \phi$ and $M \models_Z \psi$;

TS-$\land$: $M \models_X \phi \land \psi$ iff $M \models_X \phi$ and $M \models_X \psi$;

TS-$\exists$: $M \models_X \exists v \phi$ iff there exists a function $H \in C(X, \mathrm{Dom}(M))$ such that $M \models_{X[H/v]} \phi$, where $X[H/v] = \{s[m/v] : s \in X, m \in H(s)\}$;

TS-$\forall$: $M \models_X \forall v \phi$ iff $M \models_{X[M/v]} \phi$, where $X[M/v] = \{s[m/v] : s \in X, m \in \mathrm{Dom}(M)\}$.
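
The two team-forming operations of TS-$\exists$ and TS-$\forall$, together with the relation $X(\vec t)$ of Definition 1, can be sketched in Python as follows. This is only a minimal illustration under assumptions that are not part of the text: a team is represented as a list of dictionaries, terms are restricted to variables, and the names `relation_of`, `supplement` and `duplicate` are hypothetical.

```python
def relation_of(team, variables):
    """X(t) for a tuple t of variables: the value tuples the team takes on them."""
    return {tuple(s[v] for v in variables) for s in team}

def dedupe(assignments):
    """Teams are sets, so drop repeated assignments (dicts are not hashable)."""
    seen, result = set(), []
    for s in assignments:
        key = frozenset(s.items())
        if key not in seen:
            seen.add(key)
            result.append(s)
    return result

def supplement(team, var, choice):
    """X[H/v]: extend each s with every value in the non-empty set H(s)."""
    extended = []
    for s in team:
        chosen = choice(s)
        if not chosen:
            raise ValueError("a choice function must return a non-empty set")
        extended.extend({**s, var: m} for m in chosen)
    return dedupe(extended)

def duplicate(team, var, domain):
    """X[M/v]: extend each s with every element of the domain."""
    return dedupe({**s, var: m} for s in team for m in domain)

domain = [0, 1, 2]
team = [{"x": 0}, {"x": 1}]
picked = supplement(team, "y", lambda s: {s["x"], s["x"] + 1})
everything = duplicate(team, "y", domain)
```

Here `picked` plays the role of $X[H/y]$ for the choice function $H(s) = \{s(x), s(x)+1\}$, and `everything` plays the role of $X[M/y]$.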

The next theorem can be proved by structural induction on $\phi$:

Theorem 3 (Team Semantics and Tarski Semantics) For all first order formulas $\phi(\vec v)$, all models $M$ and all teams $X$, $M \models_X \phi$ if and only if for all $s \in X$, $M \models_s \phi$ with respect to Tarski Semantics.



Thus, in the case of First Order Logic it is possible to reduce Team 
Semantics to Tarski Semantics. What is then the point of working with the 
technically more complicated Team Semantics? As we will see in the next 
subsection, the answer is that Team Semantics allows us to extend First 
Order Logic in novel and interesting ways. 

Note that on every model $M$, there are two teams with empty domain: the empty team $\emptyset$, and the team $\{\emptyset\}$ containing the empty assignment $\emptyset$. All the logics that we consider in this paper have the empty team property: $M \models_\emptyset \phi$ for every formula $\phi$ and model $M$. Thus, we say that a sentence $\phi$ is true in a model $M$ if $M \models_{\{\emptyset\}} \phi$. If this is the case, we drop the subscript $\{\emptyset\}$, and write just $M \models \phi$.

2.2 Dependencies in Team Semantics 

As we saw, in Team Semantics formulas are satisfied or not satisfied by sets of assignments, called teams; and a team corresponds in a natural way to a relation over the domain of the model. Therefore, any property of relations can be made to correspond to some property of teams, which we can then add to our language as a new atomic formula. In particular, we can do so for database-theoretic dependency notions, thus obtaining the following generalized atoms²:

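Definition 4 (Dependence Atoms) Let $\vec t_1, \vec t_2, \vec t_3$ be tuples of terms over some vocabulary. Then, for all models $M$ and all teams $X$ over $M$ whose domain contains the variables of $\vec t_1 \vec t_2 \vec t_3$,

TS-fdep: $M \models_X {=}(\vec t_1, \vec t_2)$ if and only if, for all $s, s' \in X$, $\vec t_1(s) = \vec t_1(s') \Rightarrow \vec t_2(s) = \vec t_2(s')$;

TS-exc: For $|\vec t_1| = |\vec t_2|$, $M \models_X \vec t_1 \mid \vec t_2$ if and only if $X(\vec t_1) \cap X(\vec t_2) = \emptyset$;

TS-inc: For $|\vec t_1| = |\vec t_2|$, $M \models_X \vec t_1 \subseteq \vec t_2$ if and only if $X(\vec t_1) \subseteq X(\vec t_2)$;

TS-ind: $M \models_X \vec t_1 \perp \vec t_2$ if and only if for all $s, s' \in X$ there exists an $s'' \in X$ with $\vec t_1(s'') = \vec t_1(s)$ and $\vec t_2(s'') = \vec t_2(s')$;

TS-cond-ind: $M \models_X \vec t_2 \perp_{\vec t_1} \vec t_3$ if and only if for all $s, s' \in X$ with $\vec t_1(s) = \vec t_1(s')$ there exists an $s'' \in X$ with $(\vec t_1 \vec t_2)(s'') = (\vec t_1 \vec t_2)(s)$ and $(\vec t_1 \vec t_3)(s'') = (\vec t_1 \vec t_3)(s')$.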
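
The five satisfaction conditions above are plain conditions on finite relations, so they can be checked mechanically. The following Python sketch does this under illustrative assumptions that are not in the text: teams are lists of dictionaries, terms are restricted to variables, and all function names are hypothetical.

```python
def values(team, variables):
    """X(t): the set of value tuples of the team on a tuple of variables."""
    return {tuple(s[v] for v in variables) for s in team}

def dependence(team, t1, t2):
    """=(t1, t2): assignments that agree on t1 also agree on t2."""
    seen = {}
    for s in team:
        key, val = tuple(s[v] for v in t1), tuple(s[v] for v in t2)
        if seen.setdefault(key, val) != val:
            return False
    return True

def exclusion(team, t1, t2):
    """t1 | t2: the two value sets are disjoint."""
    return values(team, t1).isdisjoint(values(team, t2))

def inclusion(team, t1, t2):
    """t1 ⊆ t2: every value of t1 also occurs as a value of t2."""
    return values(team, t1) <= values(team, t2)

def independence(team, t1, t2):
    """t1 ⊥ t2: every t1-value occurs together with every t2-value."""
    combined = values(team, tuple(t1) + tuple(t2))
    return all(a + b in combined
               for a in values(team, t1) for b in values(team, t2))

def conditional_independence(team, t1, t2, t3):
    """t2 ⊥_{t1} t3: t2 and t3 are independent inside each block of constant t1."""
    blocks = {}
    for s in team:
        blocks.setdefault(tuple(s[v] for v in t1), []).append(s)
    return all(independence(block, t2, t3) for block in blocks.values())

team = [{"x": 0, "y": 0, "z": 1}, {"x": 1, "y": 1, "z": 0}]
```

On the sample team, `y` is functionally determined by `x` and the values of `z` are included in those of `x`, while `x` and `y` are not independent.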



² The notion of "generalized atom" is defined formally in [20].



These atoms correspond respectively to functional dependencies [1], to exclusion dependencies [I], to inclusion dependencies [SJ[3J, to independence conditions [11], and to conditional independence conditions³, and by adding them to the language of First Order Logic we can obtain various logics, whose principal known properties we will now briefly recall.

Dependence Logic FO(D) is obtained by adding functional dependence atoms to the language of First Order Logic. It is the oldest and the most studied among the logics that we will discuss in this work, having been introduced in the seminal book [21] as an alternative approach to the study of Branching [13] and Independence-Friendly [14, 22] Quantification. It is downwards closed, in the sense that, for all models $M$, Dependence Logic formulas $\phi$ and teams $X$, if $M \models_X \phi$ then $M \models_Y \phi$ for all subsets $Y$ of $X$.

On the level of sentences, Dependence Logic has the same expressive power as Existential Second Order Logic $\Sigma_1^1$.

Theorem 5 ( |27|, IHi 124] ) Every FO(D)-sentence is equivalent to some $\Sigma_1^1$-sentence, and vice versa. In particular, FO(D) captures NP on finite models.

The equivalence between FO(D) and $\Sigma_1^1$ was extended to formulas by Kontinen and Väänänen, who proved the following characterization:

Theorem 6 ([19]) Let $\phi$ be an FO(D)-formula with free variables in $\vec v$. Then there exists a $\Sigma_1^1$-sentence $\Phi(R)$, where $R$ is a $|\vec v|$-ary relation symbol which occurs only negatively in $\Phi$, such that

$M \models_X \phi \iff (M, X(\vec v)) \models \Phi(R)$ for all models $M$ and teams $X \neq \emptyset$.

Conversely, for any such $\Phi(R)$ there exists an FO(D)-formula $\phi$ such that the above holds.

Thus, FO(D) is the strongest logic that can be obtained by adding $\Sigma_1^1$-definable downwards-closed dependence conditions to First-Order Logic. Indeed, any such condition will be expressible as $\exists S (X(\vec v) \subseteq S \land \Phi(S))$ for some $\Phi$ in $\Sigma_1^1$, and therefore it will be equivalent to some FO(D)-formula.

Exclusion Logic FO($\mid$), on the other hand, is the logic obtained by adding exclusion atoms to First-Order Logic. It was introduced in [10], where it was shown to be equivalent to Dependence Logic with respect to formulas.



³ As observed in [7], conditional independence atoms also correspond to embedded multivalued dependencies.

Conditional Independence Logic FO($\perp_c$), which was introduced in [12], adds conditional independence atoms $\vec t_2 \perp_{\vec t_1} \vec t_3$ to the language of First Order Logic. Like FO(D), FO($\perp_c$) is equivalent to $\Sigma_1^1$ with respect to sentences, and also with respect to formulas:

Theorem 7 ([12]) Every FO($\perp_c$)-sentence is equivalent to some $\Sigma_1^1$-sentence, and vice versa.

Theorem 8 ([10]) A class of relations is definable in Conditional Independence Logic if and only if it contains the empty relation and it is $\Sigma_1^1$-definable.

Therefore, Conditional Independence Logic is the strongest logic that can be obtained by adding $\Sigma_1^1$-definable dependencies which are true of the empty relation to First Order Logic. In particular, this implies that every FO(D)-formula (and, therefore, every FO($\mid$)-formula) is equivalent to some FO($\perp_c$)-formula.⁴ However, the converse is not true, since FO($\perp_c$)-formulas are not, in general, downwards closed.

Furthermore, Inclusion/Exclusion Logic FO($\subseteq, \mid$), that is, the logic obtained by adding inclusion and exclusion dependencies to First Order Logic, was proved in [10] to be equivalent to FO($\perp_c$) with respect to formulas.

Finally, Independence Logic FO($\perp$) is the logic obtained by adding only non-conditional independence atoms $\vec t_1 \perp \vec t_2$ to First Order Logic. As proved in [22], Independence Logic and Conditional Independence Logic are also equivalent with respect to formulas.

Inclusion Logic FO($\subseteq$) is obtained by adding inclusion atoms to First Order Logic. It is not downwards closed, but it is closed under unions in the following sense: if $\phi$ is an FO($\subseteq$)-formula, $M$ is a model, and $X_i$, $i \in I$, are teams on $M$ such that $M \models_{X_i} \phi$ for all $i \in I$, then $M \models_X \phi$, where $X = \bigcup_{i \in I} X_i$. (For a proof, see [10].)
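
For a single inclusion atom, both halves of this statement can be seen on a small example. The Python sketch below is illustrative only; the list-of-dictionaries representation of teams and the helper names are assumptions made for the example.

```python
def inclusion(team, t1, t2):
    """t1 ⊆ t2 on a team: X(t1) is a subset of X(t2)."""
    column = lambda variables: {tuple(s[v] for v in variables) for s in team}
    return column(t1) <= column(t2)

def union(*teams):
    """Union of teams, with duplicate assignments removed."""
    merged = {frozenset(s.items()) for team in teams for s in team}
    return [dict(items) for items in merged]

team_a = [{"x": 0, "y": 1}, {"x": 1, "y": 0}]  # satisfies x ⊆ y
team_b = [{"x": 2, "y": 2}]                    # satisfies x ⊆ y
sub_a = team_a[:1]                             # a subteam of team_a

holds_on_union = inclusion(union(team_a, team_b), ("x",), ("y",))
holds_on_subteam = inclusion(sub_a, ("x",), ("y",))
```

The union of the two teams still satisfies the atom, whereas the subteam `sub_a` does not, since its only value for `x` no longer appears among the values of `y`.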

Relatively little is known about the expressive power of FO($\subseteq$), and the main purpose of the present work is precisely to remedy this. Here we only recall the following results from [10]:

1. On the level of formulas, FO($\subseteq$) is strictly weaker than FO($\perp_c$) = FO($\perp$) = $\Sigma_1^1$, and incomparable with FO(D) = FO($\mid$).



⁴ This was already shown in [12], in which it was shown that any dependence atom ${=}(\vec t_1, \vec t_2)$ is equivalent to the conditional independence atom $\vec t_2 \perp_{\vec t_1} \vec t_2$.



2. The complement of the transitive closure of any first-order formula $\phi(\vec x, \vec y)$ is definable in FO($\subseteq$); hence, FO($\subseteq$) is strictly stronger than First Order Logic on sentences.

3. On the level of sentences, FO($\subseteq$) is contained in $\Sigma_1^1$.

We next give a couple of further examples of the expressive power of FO($\subseteq$).

Example 9 (a) Consider the sentence $\phi := \exists x \exists y (y \subseteq x \land Exy)$. Let $M = (\mathrm{Dom}(M), E^M)$ be a finite model. Then $M \models \phi$ if and only if $E^M$ contains a cycle, i.e., there are $a_0, \ldots, a_{n-1} \in \mathrm{Dom}(M)$ such that $(a_i, a_{i+1}) \in E^M$ for all $i < n-1$, and $(a_{n-1}, a_0) \in E^M$.

The idea here is the following: by the lax semantics, the first existential quantifier gives a set $C$ of values for $x$, and the formula $\exists y (y \subseteq x \land Exy)$ then says that for every $a \in C$ there is a $b \in C$ such that $(a, b) \in E^M$.
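
On very small models the sentence of part (a) can be evaluated directly from the lax team semantics by brute force. The following Python sketch is an illustration under stated assumptions, not an efficient procedure: it implements only the connectives this sentence needs (relational atoms, inclusion atoms, conjunction and the existential quantifier), it enumerates all choice functions, and the tuple encoding of formulas is a hypothetical convention chosen for the example.

```python
from itertools import combinations, product

def freeze(assignment):
    """Hashable form of an assignment, so that teams can be sets."""
    return frozenset(assignment.items())

def nonempty_subsets(items):
    items = list(items)
    return [c for r in range(1, len(items) + 1) for c in combinations(items, r)]

def holds(domain, relations, team, formula):
    kind = formula[0]
    if kind == "rel":      # ("rel", name, variables): first-order atom
        _, name, variables = formula
        return all(tuple(dict(s)[v] for v in variables) in relations[name]
                   for s in team)
    if kind == "inc":      # ("inc", t1, t2): inclusion atom t1 ⊆ t2
        _, t1, t2 = formula
        column = lambda vs: {tuple(dict(s)[v] for v in vs) for s in team}
        return column(t1) <= column(t2)
    if kind == "and":
        return all(holds(domain, relations, team, sub) for sub in formula[1:])
    if kind == "exists":   # try every choice function H : X -> P(Dom) \ {∅}
        _, var, body = formula
        members = list(team)
        for choice in product(nonempty_subsets(domain), repeat=len(members)):
            new_team = frozenset(freeze({**dict(s), var: m})
                                 for s, chosen in zip(members, choice)
                                 for m in chosen)
            if holds(domain, relations, new_team, body):
                return True
        return False
    raise ValueError(f"unsupported connective: {kind}")

def cycle_sentence_holds(domain, edges):
    phi = ("exists", "x", ("exists", "y",
           ("and", ("inc", ("y",), ("x",)), ("rel", "E", ("x", "y")))))
    start = frozenset([freeze({})])  # the team {∅}
    return holds(domain, {"E": set(edges)}, start, phi)
```

For instance, `cycle_sentence_holds([0, 1, 2], [(0, 1), (1, 0)])` finds the witness set $C = \{0, 1\}$, while on the acyclic graph with edges `(0, 1)` and `(1, 2)` no non-empty set of values for $x$ works.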

(b) Let $\psi$ be the FO($\subseteq$)-sentence $\exists w (\exists u (Pu \land u \subseteq w) \land \forall u (Ewu \to \exists v (Euv \land v \subseteq w)))$. Then $M \models \psi$ if and only if player I has a winning strategy in the following game $G(M)$: Player I starts by choosing some element $a_0 \in P^M$. In each odd round $i+1$, player II chooses an element $a_{i+1}$ such that $(a_i, a_{i+1}) \in E^M$. In each even round $i+1$, player I chooses an element $a_{i+1}$ such that $(a_i, a_{i+1}) \in E^M$. The first player unable to move according to the rules loses the game. Player I wins all infinite plays of the game.

The class $K$ of all finite models $M$ such that player II has a winning strategy in $G(M)$ is equivalent to Immerman's alternating graph accessibility problem, AGAP. It is well known that AGAP is a complete problem for PTIME with respect to quantifier free reductions ([18]).
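
On a finite model, the set of positions from which player I can keep answering forever (or leave player II stuck) can be computed by shrinking a candidate set until it stabilises. The sketch below is a hedged illustration of the game in part (b); the function names are hypothetical, and the sentence $\psi$ is read as asking for a set $W$ that meets $P^M$ and in which every $E$-successor of a member has an $E$-successor back in $W$.

```python
def safe_region(domain, edges):
    """Largest W such that every E-successor b of a in W has an E-successor in W."""
    successors = {a: {b for (c, b) in edges if c == a} for a in domain}
    region = set(domain)
    while True:
        refined = {a for a in region
                   if all(successors[b] & region for b in successors[a])}
        if refined == region:
            return region
        region = refined

def player_one_wins(domain, edges, start_set):
    """Player I wins G(M) iff some allowed start position lies in the safe region."""
    return bool(safe_region(domain, edges) & set(start_set))

domain = [0, 1, 2, 3]
edges = {(0, 1), (1, 0), (2, 3)}
region = safe_region(domain, edges)
```

In this example position 2 is lost for player I, because player II can move to 3, from which player I has no move; position 3 itself is safe as a start, since player II is then the first player unable to move.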

2.3 Greatest Fixed Point Logic 

Let $\psi(R, \vec x)$ be a first-order formula such that the arity of $R$, $\mathrm{ar}(R)$, is equal to the length $k = |\vec x|$ of the tuple $\vec x$. If $M$ is a model, then $\psi$ defines an operation $\Gamma = \Gamma_{M,\psi}$ on the set $\mathcal{P}(\mathrm{Dom}(M)^k)$ of $k$-ary relations on $\mathrm{Dom}(M)$ as follows:

$\Gamma(P) := \{\vec a : (M, P) \models_{[\vec a / \vec x]} \psi(R, \vec x)\}$ for each $P \in \mathcal{P}(\mathrm{Dom}(M)^k)$.

A relation $P$ is a fixed point of the operation $\Gamma_{M,\psi}$ on $M$ if $\Gamma(P) = P$. Furthermore, $P$ is the greatest fixed point (least fixed point) of $\Gamma_{M,\psi}$ if $Q \subseteq P$ ($P \subseteq Q$, respectively) for all fixed points $Q$ of $\Gamma_{M,\psi}$.




It is well known that if $R$ occurs only positively in $\psi$, then for every model $M$, $\Gamma_{M,\psi}$ has a greatest fixed point (as well as a least fixed point). Moreover, the greatest fixed point $P$ of $\Gamma_{M,\psi}$ has the following characterization: $P = \bigcup \{Q \subseteq \mathrm{Dom}(M)^k : Q \subseteq \Gamma_{M,\psi}(Q)\}$ (see, e.g. [21]).
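
On a finite model this characterization can be checked directly. The Python sketch below computes the greatest fixed point of a monotone operator by iterating downwards from the full relation, and compares it with the union of all $Q$ satisfying $Q \subseteq \Gamma(Q)$ found by brute force. The operator used, corresponding to $\psi(R, x) := \exists y (Exy \land Ry)$, and all names are illustrative assumptions.

```python
from itertools import chain, combinations

def greatest_fixed_point(universe, operator):
    """Iterate a monotone operator downwards from the full set until it stabilises."""
    current = frozenset(universe)
    while True:
        following = frozenset(operator(current))
        if following == current:
            return current
        current = following

def union_of_post_fixed_points(universe, operator):
    """Union of all Q with Q ⊆ operator(Q), by exhaustive search."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    result = set()
    for q in map(frozenset, subsets):
        if q <= frozenset(operator(q)):
            result |= q
    return frozenset(result)

edges = {(0, 1), (1, 0), (2, 0), (3, 4)}
universe = range(5)

# Γ(Q) for ψ(R, x) := ∃y (Exy ∧ Ry); R occurs only positively, so Γ is monotone.
def gamma(q):
    return {a for a in universe if any((a, b) in edges for b in q)}

by_iteration = greatest_fixed_point(universe, gamma)
by_union = union_of_post_fixed_points(universe, gamma)
```

Both computations return the set of elements from which an infinite $E$-path starts, here $\{0, 1, 2\}$.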

Definition 10 Greatest Fixpoint Logic, GFP, is obtained by adding to First Order Logic the greatest fixed point operator $[\mathrm{gfp}_{R,\vec x}\, \psi(R, \vec x)]\vec t$, where $R$ is a relation variable with $\mathrm{ar}(R) = |\vec x|$, $\psi(R, \vec x)$ is a formula in which $R$ occurs only positively, and $\vec t$ is a tuple of terms with $|\vec t| = |\vec x|$. The semantics of the operator gfp is defined by the clause:

• $M \models_s [\mathrm{gfp}_{R,\vec x}\, \psi(R, \vec x)]\vec t$ if and only if $\vec t(s)$ is in the greatest fixed point of $\Gamma_{M,\psi}$.

Positive Greatest Fixed Point Logic, GFP$^+$, is the fragment of Greatest Fixed Point Logic in which fixed point operators occur only positively.

Least Fixpoint Logic, LFP, similarly, introduces an operator $[\mathrm{lfp}_{R,\vec x}\, \psi(R, \vec x)]\vec t$, again for $R$ occurring only positively in $\psi$, such that $M \models_s [\mathrm{lfp}_{R,\vec x}\, \psi(R, \vec x)]\vec t$ if and only if $\vec t(s)$ is in the least fixed point of $\Gamma_{M,\psi}$.

Fixed point logics have been the object of a vast amount of research, 
especially because of their applications in Finite Model Theory and Descrip- 
tive Complexity Theory. In particular, Least Fixed Point Logic captures the 
complexity class PTIME that consists of all problems that are solvable in 
polynomial time: 

Theorem 11 ([17, 26]) A class of linearly ordered finite models is definable in LFP if and only if it can be recognized in PTIME.

Another important result is that on finite models, Greatest Fixed Point 
Logic has the same expressive power as Least Fixed Point Logic. 

Theorem 12 ([17]) Over finite models, GFP$^+$ (as well as GFP) is equivalent to LFP.

We will also make use of the following normal form result for Positive 
Greatest Fixed Point Logic: 

Theorem 13 ( [23J, 117] ) Every GFP$^+$-sentence $\phi$ is equivalent to a GFP$^+$-sentence of the form $\exists \vec z\, [\mathrm{gfp}_{R,\vec x}\, \psi(R, \vec x)]\vec z$, where $\psi$ is a first-order formula.



3 Inclusion Logic captures GFP$^+$

We will now prove that Inclusion Logic has exactly the same expressive power as Positive Greatest Fixed Point Logic. Since the semantics of GFP$^+$ is defined in terms of single assignments instead of teams, the equivalence of FO($\subseteq$) and GFP$^+$ on formulas has to be formulated in a somewhat indirect way; see Theorems 15 and 16 below.

We start with a lemma that connects teams and the greatest fixed point 
operator: 

Lemma 14 Let $\psi(S, \vec x)$ be a GFP$^+$-formula with free variables in $\vec x = (x_1, \ldots, x_n)$ such that $S$ is $n$-ary and occurs only positively in $\psi$, let $M$ be a model, and let $Y$ be a team on $M$.

(a) If $(M, Y(\vec x)) \models_s \psi(S, \vec x)$ for all $s \in Y$, then $M \models_s [\mathrm{gfp}_{S,\vec x}\, \psi(S, \vec x)]\vec x$ for all $s \in Y$.

(b) If $Y$ is a maximal team such that $M \models_s [\mathrm{gfp}_{S,\vec x}\, \psi(S, \vec x)]\vec x$ for all $s \in Y$, then $(M, Y(\vec x)) \models_s \psi(S, \vec x)$ for all $s \in Y$.

Proof. Note that $(M, Y(\vec x)) \models_s \psi(S, \vec x)$ for all $s \in Y$ if and only if $Y(\vec x) \subseteq \Gamma_{M,\psi}(Y(\vec x))$. Thus, claim (a) follows from the fact that the greatest fixed point of $\Gamma_{M,\psi}$ is the union of all relations $Q$ such that $Q \subseteq \Gamma_{M,\psi}(Q)$. Claim (b) follows from the observation that if $Y$ is a maximal team such that $M \models_s [\mathrm{gfp}_{S,\vec x}\, \psi(S, \vec x)]\vec x$ for all $s \in Y$, then $Y(\vec x)$ is the greatest fixed point of $\Gamma_{M,\psi}$.

We will next prove that every FO($\subseteq$)-formula can be expressed in GFP$^+$.

Theorem 15 For every FO($\subseteq$)-formula $\phi(\vec x)$ with free variables in $\vec x = (x_1, \ldots, x_n)$ there is a GFP$^+$-formula $\phi^* = \phi^*(R, \vec x)$ such that $\mathrm{ar}(R) = |\vec x|$, $R$ occurs only positively in $\phi^*$, and the condition

$M \models_X \phi(\vec x) \iff (M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$

holds for all models $M$ and teams $X$ with $\mathrm{Dom}(X) = \{x_1, \ldots, x_n\}$.

Proof. The proof is by structural induction on $\phi$.

1. If $\phi(\vec x)$ is a first-order literal, let $\phi^*(R, \vec x)$ be just $\phi(\vec x)$. Then we have

$M \models_X \phi(\vec x) \iff M \models_s \phi(\vec x)$ for all $s \in X$
$\iff (M, X(\vec x)) \models_s \phi(\vec x)$ for all $s \in X$.

2. If $\phi(\vec x)$ is an inclusion atom $\vec t_1 \subseteq \vec t_2$, let $\phi^*(R, \vec x)$ be $\exists \vec z (R\vec z \land \vec t_1(\vec x) = \vec t_2(\vec z))$, where $\vec z$ is a tuple of new variables. Note that $(M, X(\vec x)) \models_h R\vec z \land \vec t_1(\vec x) = \vec t_2(\vec z)$ for an assignment $h$ defined on $\vec x \vec z$ if and only if there are two assignments $s, s'$ defined on $\vec x$, with $s' \in X$, such that $\vec t_1(s) = \vec t_2(s')$ and $h = s \cup (s' \circ f)$, where $f$ is the function $f(z_i) = x_i$. Thus, we see that $(M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$ if and only if for every $s \in X$ there is an $s' \in X$ such that $\vec t_1(s) = \vec t_2(s')$, as desired.

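3. Assume next that $\phi(\vec x)$ is of the form $\psi(\vec x) \lor \theta(\vec x)$. Then we define

$\phi^*(R, \vec x) := [\mathrm{gfp}_{S,\vec x}(R\vec x \land \psi^*(S, \vec x))]\vec x \lor [\mathrm{gfp}_{T,\vec x}(R\vec x \land \theta^*(T, \vec x))]\vec x$.

If $M \models_X \phi(\vec x)$, then there are teams $Y$ and $Z$ such that $X = Y \cup Z$, $M \models_Y \psi(\vec x)$ and $M \models_Z \theta(\vec x)$. By induction hypothesis, $(M, Y(\vec x)) \models_s \psi^*(S, \vec x)$, and consequently $(M, X(\vec x), Y(\vec x)) \models_s R\vec x \land \psi^*(S, \vec x)$, holds for all $s \in Y$. Hence, by Lemma 14, $(M, X(\vec x)) \models_s [\mathrm{gfp}_{S,\vec x}(R\vec x \land \psi^*(S, \vec x))]\vec x$ holds for all $s \in Y$.

In the same way we see that $(M, X(\vec x)) \models_s [\mathrm{gfp}_{T,\vec x}(R\vec x \land \theta^*(T, \vec x))]\vec x$ holds for all $s \in Z$. Thus, we conclude that $(M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$.

To prove the converse, assume that $(M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$. Let $Y$ be the set of all assignments $s \in X$ that satisfy the first disjunct of $\phi^*(R, \vec x)$, and let $Z$ be the set of assignments $s \in X$ that satisfy the second disjunct. Then $Y$ is the maximal team such that, for all $s \in Y$, $(M, X(\vec x)) \models_s [\mathrm{gfp}_{S,\vec x}(R\vec x \land \psi^*(S, \vec x))]\vec x$. It follows from Lemma 14 that $(M, X(\vec x), Y(\vec x)) \models_s R\vec x \land \psi^*(S, \vec x)$ for all $s \in Y$. Thus, $(M, Y(\vec x)) \models_s \psi^*(S, \vec x)$ for all $s \in Y$, and by induction hypothesis, $M \models_Y \psi(\vec x)$. In the same way we see that $M \models_Z \theta(\vec x)$. Finally, since $X = Y \cup Z$, we conclude that $M \models_X \phi(\vec x)$.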

4. If $\phi(\vec x) = \psi(\vec x) \land \theta(\vec x)$, we define simply $\phi^*(R, \vec x) := \psi^*(R, \vec x) \land \theta^*(R, \vec x)$. The claim follows then directly from the induction hypothesis.

5. If $\phi(\vec x)$ is of the form $\exists v\, \psi(\vec x v)$, let $\phi^*(R, \vec x)$ be

$\exists v\, [\mathrm{gfp}_{S,\vec x v}(R\vec x \land \psi^*(S, \vec x v))]\vec x v$.

Then $M \models_X \phi(\vec x)$ if and only if there is a function $H \in C(X, \mathrm{Dom}(M))$ such that $M \models_Y \psi(\vec x v)$, where $Y = X[H/v]$. By the induction hypothesis, this is equivalent to $(M, Y(\vec x v)) \models_h \psi^*(S, \vec x v)$ being true for all $h \in Y$. This, in turn, is equivalent to the condition

$(M, X(\vec x), Y(\vec x v)) \models_h R\vec x \land \psi^*(S, \vec x v)$ for all $h \in Y$. (1)

If condition (1) holds, then by Lemma 14, $(M, X(\vec x)) \models_h [\mathrm{gfp}_{S,\vec x v}(R\vec x \land \psi^*(S, \vec x v))]\vec x v$ holds for all $h \in Y$. Since every $s \in X$ has an extension $h \in Y$, it follows that $(M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$.

On the other hand, if $(M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$, we define $H \in C(X, \mathrm{Dom}(M))$ to be the function such that

$H(s) := \{a \in \mathrm{Dom}(M) : (M, X(\vec x)) \models_{s[a/v]} [\mathrm{gfp}_{S,\vec x v}(R\vec x \land \psi^*(S, \vec x v))]\vec x v\}$,

and let $Y = X[H/v]$. Then $Y$ is the maximal team such that

$(M, X(\vec x)) \models_h [\mathrm{gfp}_{S,\vec x v}(R\vec x \land \psi^*(S, \vec x v))]\vec x v$

for all $h \in Y$, whence condition (1) follows from Lemma 14.

6. If $\phi(\vec x)$ is of the form $\forall v\, \psi(\vec x v)$, let $\phi^*(R, \vec x)$ be

$\forall v\, [\mathrm{gfp}_{S,\vec x v}(R\vec x \land \psi^*(S, \vec x v))]\vec x v$.

The proof of the claim is similar to the case of existential quantification.

In proving that GFP$^+$-sentences can be expressed in FO($\subseteq$) we will use the normal form given in Theorem 13. Thus, it suffices to find translations for first-order formulas, and formulas obtained by a single application of the gfp-operator to first-order formulas.

Theorem 16 Let $\eta(R, \vec x, \vec y)$ be a first-order formula such that $R$ occurs only positively in $\eta$, $\mathrm{ar}(R) = |\vec x| = n$, and the free variables of $\eta$ are in $\vec x \vec y$.

(a) There exists an FO($\subseteq$)-formula $\eta^+(\vec x, \vec y)$ such that for all models $M$ and teams $X$ on $M$

$M \models_X \eta^+(\vec x, \vec y) \iff (M, X(\vec x)) \models_s \eta(R, \vec x, \vec y)$ for every $s \in X$.

(b) If $\vec y$ is empty, and $\vec z$ is an $n$-tuple of variables not occurring in $\eta$, then there exists an FO($\subseteq$)-formula $\hat\eta(\vec z)$ such that for all models $M$ and teams $X$ on $M$

$M \models_X \hat\eta(\vec z) \iff M \models_s [\mathrm{gfp}_{R,\vec x}\, \eta(R, \vec x)]\vec z$ for every $s \in X$.

Proof. (a) We prove the claim by structural induction on $\eta$.

1. If $\eta(R, \vec x, \vec y)$ is a first-order literal not containing the relation symbol $R$, we define $\eta^+ := \eta$. Then $M \models_X \eta^+$ if and only if $M \models_s \eta$ for every $s \in X$. Since $R$ does not occur in $\eta$, this is equivalent to $(M, X(\vec x)) \models_s \eta$ for all $s \in X$, as required.

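2. If $\eta$ is of the form $R\vec t$, we define $\eta^+(\vec x, \vec y) := \vec t \subseteq \vec x$. Then we have

$M \models_X \eta^+(\vec x, \vec y) \iff \forall s \in X\ \exists s' \in X : \vec t(s) = \vec x(s')$
$\iff \forall s \in X : \vec t(s) \in X(\vec x)$
$\iff \forall s \in X : (M, X(\vec x)) \models_s R\vec t$.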

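3. If $\eta$ is of the form $\alpha(R, \vec x, \vec y) \lor \beta(R, \vec x, \vec y)$, let $\vec u = (u_1, \ldots, u_n)$ be a tuple of new variables and let $\eta^+(\vec x, \vec y)$ be the formula

$\exists \vec u \big( (\vec u \subseteq \vec x) \land (\alpha^+(\vec u, \vec x \vec y) \lor \beta^+(\vec u, \vec x \vec y)) \big)$.

Here we assume as induction hypothesis that $M \models_Y \alpha^+(\vec u, \vec x \vec y)$ if and only if $(M, Y(\vec u)) \models_h \alpha(R, \vec x, \vec y)$ for all $h \in Y$, and similarly for $\beta^+(\vec u, \vec x \vec y)$ and $\beta(R, \vec x, \vec y)$.

Suppose first that $M \models_X \eta^+(\vec x, \vec y)$. Then there is a function $H \in C(X, \mathrm{Dom}(M)^n)$ such that $X[H/\vec u](\vec u) \subseteq X(\vec x)$, and furthermore, $X[H/\vec u]$ can be split into two subteams $Y$ and $Z$ such that $M \models_Y \alpha^+(\vec u, \vec x \vec y)$ and $M \models_Z \beta^+(\vec u, \vec x \vec y)$. Now take any $s \in X$ and let $h \in X[H/\vec u]$ be an extension of $s$. If $h \in Y$ then $(M, Y(\vec u)) \models_h \alpha(R, \vec x, \vec y)$. Since $Y(\vec u) \subseteq X[H/\vec u](\vec u) \subseteq X(\vec x)$, $\vec x \vec y(h) = \vec x \vec y(s)$ and $R$ occurs only positively in $\alpha$, we have $(M, X(\vec x)) \models_s \alpha(R, \vec x, \vec y)$. Similarly, if $h \in Z$ then $(M, X(\vec x)) \models_s \beta(R, \vec x, \vec y)$. Thus, $(M, X(\vec x)) \models_s \alpha(R, \vec x, \vec y) \lor \beta(R, \vec x, \vec y)$ for all $s \in X$, as required.

Conversely, suppose that for any $s \in X$, $(M, X(\vec x)) \models_s \alpha(R, \vec x, \vec y) \lor \beta(R, \vec x, \vec y)$. Now let $H \in C(X, \mathrm{Dom}(M)^n)$ be the function such that $H(s) = X(\vec x)$ for all $s \in X$. Note first that clearly $M \models_{X[H/\vec u]} \vec u \subseteq \vec x$. Let $Y = \{h \in X[H/\vec u] : (M, X(\vec x)) \models_h \alpha(R, \vec x, \vec y)\}$ and $Z = \{h \in X[H/\vec u] : (M, X(\vec x)) \models_h \beta(R, \vec x, \vec y)\}$. By hypothesis, $X[H/\vec u] = Y \cup Z$.

If $Y \neq \emptyset$, then $Y(\vec u) = X[H/\vec u](\vec u) = X(\vec x)$: indeed, if $(M, X(\vec x)) \models_h \alpha(R, \vec x, \vec y)$ then the same holds for all $h'$ which differ from $h$ only with respect to $\vec u$, since $\vec u$ is not free in $\alpha$. Therefore $(M, Y(\vec u)) \models_h \alpha(R, \vec x, \vec y)$ for all $h \in Y$, and thus $M \models_Y \alpha^+(\vec u, \vec x \vec y)$. If instead $Y = \emptyset$, then $M \models_Y \alpha^+(\vec u, \vec x \vec y)$ trivially. Similarly, $M \models_Z \beta^+(\vec u, \vec x \vec y)$, and therefore $M \models_{X[H/\vec u]} \alpha^+(\vec u, \vec x \vec y) \lor \beta^+(\vec u, \vec x \vec y)$, whence the function $H$ witnesses that $M \models_X \eta^+$.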

4. If $\eta$ is $\alpha(R, \vec x, \vec y) \land \beta(R, \vec x, \vec y)$, let $\eta^+(\vec x, \vec y)$ be $\alpha^+(\vec x, \vec y) \land \beta^+(\vec x, \vec y)$. Then the claim follows directly from the induction hypothesis.

5. If $\eta(R, \vec x, \vec y)$ is $\exists v\, \alpha(R, \vec x, \vec y v)$, let $\eta^+(\vec x, \vec y)$ be $\exists v\, \alpha^+(\vec x, \vec y v)$; here we assume w.l.o.g. that $v$ is not among the variables in $\vec x \vec y$. Then $M \models_X \eta^+(\vec x, \vec y)$ if and only if there is a function $H \in C(X, \mathrm{Dom}(M))$ such that $M \models_{X[H/v]} \alpha^+(\vec x, \vec y v)$. Since $X[H/v](\vec x) = X(\vec x)$, by induction hypothesis this is equivalent to the condition

$(M, X(\vec x)) \models_h \alpha(R, \vec x, \vec y v)$ holds for all $h \in X[H/v]$. (2)

If condition (2) is true, then clearly $(M, X(\vec x)) \models_s \eta(R, \vec x, \vec y)$ for all $s \in X$. Conversely, if $(M, X(\vec x)) \models_s \eta(R, \vec x, \vec y)$ holds for all $s \in X$, then (2) is true for the function $H$ such that $H(s) = \{a \in \mathrm{Dom}(M) : (M, X(\vec x)) \models_{s[a/v]} \alpha(R, \vec x, \vec y v)\}$.

6. If $\eta(R, \vec x, \vec y)$ is $\forall v\, \alpha(R, \vec x, \vec y v)$, let $\eta^+(\vec x, \vec y)$ be $\forall v\, \alpha^+(\vec x, \vec y v)$. The proof of the claim is similar to the previous case.

(b) Let $\vec z$ be an $n$-tuple of variables not occurring in $\eta$. We define $\hat\eta(\vec z)$ to be the formula $\exists \vec x (\vec z \subseteq \vec x \land \eta^+(\vec x))$, where $\eta^+$ is the FO($\subseteq$)-formula corresponding to $\eta(R, \vec x)$, as given in claim (a). Suppose first that $M \models_X \hat\eta(\vec z)$. Then there is a function $H \in C(X, \mathrm{Dom}(M)^n)$ such that $M \models_Y \eta^+(\vec x)$, and $\vec z(h) \in Y(\vec x)$ for all $h \in Y$, where $Y = X[H/\vec x]$. Thus, by claim (a), $(M, Y(\vec x)) \models_h \eta(R, \vec x)$ holds for all $h \in Y$. It follows now from Lemma 14 that $M \models_h [\mathrm{gfp}_{R,\vec x}\, \eta(R, \vec x)]\vec x$ for all $h \in Y$. Since every $s \in X$ has an extension $h \in Y$, and $\vec z(s) = \vec z(h) \in Y(\vec x)$, we conclude that $M \models_s [\mathrm{gfp}_{R,\vec x}\, \eta(R, \vec x)]\vec z$ for all $s \in X$.

To prove the converse, assume that $M \models_s [\mathrm{gfp}_{R,\vec x}\, \eta(R, \vec x)]\vec z$ for all $s \in X$. Let $P$ be the greatest fixed point of the formula $\eta(R, \vec x)$ (with respect to $R$ and $\vec x$) on the model $M$, and let $H \in C(X, \mathrm{Dom}(M)^n)$ be the function such that $H(s) = P$ for every $s \in X$. Let $Y = X[H/\vec x]$. Then $(M, Y(\vec x)) \models_h \eta(R, \vec x)$ for all $h \in Y$, whence by claim (a), we have $M \models_Y \eta^+(\vec x)$. Moreover, $\vec z(h) \in Y(\vec x) = P$ for all $h \in Y$, whence $M \models_Y \vec z \subseteq \vec x$. Thus, the function $H$ witnesses that $M \models_X \exists \vec x (\vec z \subseteq \vec x \land \eta^+(\vec x))$.

Note that in the case of disjunction above, it was necessary to "store" the 
possible values of x into the values of a new tuple u of variables: otherwise, 
by splitting the team X into two subteams we could have lost information 
about X(x). 
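
The loss of information can be seen on a two-element team. The Python sketch below is an illustrative check rather than part of the proof: it takes the hypothetical formula $\eta(R, x, y) := Ry \lor x = 0$, evaluates the target condition of Theorem 16(a), the naive translation $(y \subseteq x) \lor x = 0$, and the translation with a stored tuple $u$, for which it uses the witness $H(s) = X(x)$ from the proof. The team representation and helper names are assumptions.

```python
from itertools import product

def column(team, var):
    """Set of values a variable takes in a team."""
    return {s[var] for s in team}

def splits(team):
    """All covers X = Y ∪ Z; each assignment goes left, right, or both."""
    for labels in product((1, 2, 3), repeat=len(team)):
        yield ([s for s, label in zip(team, labels) if label & 1],
               [s for s, label in zip(team, labels) if label & 2])

team = [{"x": 0, "y": 5}, {"x": 1, "y": 0}]
x_values = column(team, "x")

# Target: (M, X(x)) |=_s  Ry ∨ x = 0  for every s in X.
target = all(s["y"] in x_values or s["x"] == 0 for s in team)

# Naive translation (y ⊆ x) ∨ x = 0, evaluated directly on X.
naive = any(column(left, "y") <= column(left, "x")
            and all(s["x"] == 0 for s in right)
            for left, right in splits(team))

# Translation ∃u (u ⊆ x ∧ (y ⊆ u ∨ x = 0)), checked for the witness H(s) = X(x).
extended = [{**s, "u": a} for s in team for a in x_values]
stored = (column(extended, "u") <= column(extended, "x")
          and any(column(left, "y") <= column(left, "u")
                  and all(s["x"] == 0 for s in right)
                  for left, right in splits(extended)))
```

Here the target condition holds and the translation with $u$ succeeds, but the naive translation fails: the assignment with $x = 1$ must go to the left subteam, and no left subteam containing it has its $y$-values among its own $x$-values.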

The equivalence of FO($\subseteq$) and GFP$^+$ for sentences follows now from the two theorems above:

Corollary 17 For any FO($\subseteq$)-sentence $\phi$ there exists an equivalent GFP$^+$-sentence $\psi$, and vice versa.

Proof. If $\phi$ is an FO($\subseteq$)-sentence, then by Theorem 15 there is a formula $\phi^*(R, \vec x)$ such that for all models $M$ and teams $X$, $M \models_X \phi$ if and only if $(M, X(\vec x)) \models_s \phi^*(R, \vec x)$ for all $s \in X$. Thus, $M \models \phi$ if and only if $M \models \exists \vec x\, [\mathrm{gfp}_{R,\vec x}\, \phi^*(R, \vec x)]\vec x$.

On the other hand, if $\psi$ is a GFP$^+$-sentence, then by Theorem 13 we can assume that it is of the form $\exists \vec z\, [\mathrm{gfp}_{R,\vec x}\, \eta(R, \vec x)]\vec z$, where $\eta$ is a first-order formula. It follows now from Theorem 16(b) that $\psi$ is equivalent to the FO($\subseteq$)-sentence $\exists \vec z\, \hat\eta(\vec z)$.

Corollary 18 A class of linearly ordered finite models is definable in FO($\subseteq$) if and only if it can be recognized in PTIME.

This connection between Inclusion Logic, Fixed Point Logic and descriptive complexity may be of value for the further development of the area. In particular, it suggests that fragments and extensions of FO($\subseteq$) can be made to correspond to various fragments and extensions of PTIME. Hence, results concerning their relationships may lead to insights that are valuable in complexity theory, and vice versa.

4 First-Order Union Closed Properties 

From Corollary 17 it follows immediately that Inclusion Logic is strictly weaker than $\Sigma_1^1$. As an immediate consequence, not all $\Sigma_1^1$-definable union-closed properties of relations can be expressed in Inclusion Logic. For example, consider the atom

TS-$\mathcal{R}$: $M \models_X \mathcal{R}(xyzw)$ if and only if there exist two functions $f, g : \mathrm{Dom}(M) \to \mathrm{Dom}(M)$ such that, for all $a, b \in \mathrm{Dom}(M)$, $(a, f(a), b, g(b)) \in X(xyzw)$.

It is easy to see that the atom $\mathcal{R}$ is union-closed. On the other hand, it can be seen that the sentence $\forall x \exists y \forall z \exists w (\mathcal{R}(xyzw) \land (x = z \leftrightarrow y = w) \land (y = z \to x = w) \land x \neq y)$ holds in a finite model if and only if it contains an even number of elements. Since even cardinality is not definable in GFP, it follows that $\mathcal{R}$ is not definable in FO($\subseteq$).

But what about first order definable union-closed properties? As we 
will now see, all such properties are indeed definable in Inclusion Logic; and 
therefore, it is not possible to increase the expressive power of Inclusion Logic 
by adding any first order definable union-closed dependency. 

Definition 19 A sentence $\phi(R)$ is myopic if it is of the form $\forall \vec x (R\vec x \to \theta(R, \vec x))$ for some first-order formula $\theta$ in which $R$ occurs only positively.

It follows at once from Theorem 16 that myopic sentences correspond to
Inclusion Logic-definable properties:

Proposition 20 Let $\phi(R) = \forall \vec x (R\vec x \to \theta(R, \vec x))$ be a myopic sentence. Then
there exists an FO($\subseteq$)-formula $\phi^{+}(\vec x)$ such that, for all models $M$ and teams
$X$,

$M \models_X \phi^{+}(\vec x)$ if and only if $(M, X(\vec x)) \models \phi(R)$.

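Proof. Consider $\theta(R, \vec x)$: by Theorem 16 there exists an FO($\subseteq$)-formula
$\theta^{+}(\vec x)$ such that for all models $M$ and teams $X$,

$M \models_X \theta^{+}(\vec x) \iff \forall s \in X : (M, X(\vec x)) \models_s \theta(R, \vec x)$
$\iff (M, X(\vec x)) \models \forall \vec x (R\vec x \to \theta(R, \vec x))$,

as required.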

It is also easy to see that all myopic properties are union-closed. We will
now prove the converse implication: if $\phi(R)$ is a first-order sentence that
defines a union-closed property of relations, then it is equivalent to some
myopic sentence. From this preservation theorem it will follow at once that
all union-closed first-order properties of relations are definable in Inclusion
Logic.
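
The union-closure of myopic properties can be observed concretely on a small structure. The sketch below is a hypothetical example, not taken from the paper: it assumes a unary relation R and a binary relation E (here an arbitrary directed graph on four elements), and evaluates the myopic sentence $\forall x (Rx \to \exists y (Ry \wedge Exy))$ on every subset of the domain. It then checks that the family of satisfying relations is closed under binary unions.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def myopic_holds(R, edges):
    """Evaluate  forall x (R x -> exists y (R y and E(x, y)))  for unary R."""
    return all(any((x, y) in edges for y in R) for x in R)

universe = {0, 1, 2, 3}
edges = {(0, 1), (1, 0), (2, 3), (3, 3)}   # hypothetical directed graph E

# All interpretations of R that satisfy the myopic sentence.
models = [R for R in powerset(universe) if myopic_holds(R, edges)]

# Union-closure: the union of two satisfying relations is satisfying.
union_closed = all(myopic_holds(R1 | R2, edges)
                   for R1 in models for R2 in models)
```

The reason the check succeeds is visible in the code: R occurs only positively in the inner condition, so a witness y found inside R1 is still available inside R1 | R2.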

First, let us recall some model-theoretic machinery: 

Definition 21 ($\omega$-big models) A model $A$ of signature $\Sigma$ is $\omega$-big if for
all finite tuples $\vec a$ of elements of it and for all models $(B, \vec b, S)$ such that
$(A, \vec a) \equiv (B, \vec b)$ there exists a relation $P$ over $A$ such that $(A, \vec a, P) \equiv (B, \vec b, S)$.

Definition 22 ($\omega$-saturated models) A model $A$ is $\omega$-saturated if for
every finite set $C$ of elements of $A$, all complete 1-types over $C$ with respect to
$A$ are realized in $A$.

The proofs of the following model-theoretic results can be found in [15].

Theorem 23 ([15], Theorem 8.2.1) Let $A$ be a model. Then $A$ has an
$\omega$-big elementary extension.

Theorem 24 ([15], Lemma 8.3.4) Let $A$ and $B$ be $\omega$-saturated structures
over a finite signature and such that, for all sentences $\chi(R)$ in which $R$ occurs
only positively,

$A \models \chi(R) \Rightarrow B \models \chi(R)$.

Then there are elementary substructures $C$ and $D$ of $A$ and $B$ and a bijective
homomorphism $f : C \to D$ which fixes all relation symbols except $R$.

Theorem 25 (Essentially [15], Theorem 8.1.2) Suppose that $A$ is $\omega$-big
and $\vec a$ is a finite tuple of elements. Then $(A, \vec a)$ is $\omega$-saturated.

Using these results, we can prove our representation theorem: 

Theorem 26 Let $\phi(R)$ be a first-order sentence that defines a union-closed
property of $R$. Then $\phi$ is equivalent to some myopic sentence. Consequently,
every first-order definable union-closed property of relations is definable in
FO($\subseteq$).

Proof. Let $T = \{\phi'(R) : \phi'(R) \text{ is myopic}, \phi(R) \models \phi'(R)\}$. If we can show
that $T \models \phi(R)$, we are done: indeed, by compactness this implies that $\phi$
is equivalent to a finite conjunction
$\forall \vec x (R\vec x \to \theta_1(R, \vec x)) \wedge \ldots \wedge \forall \vec x (R\vec x \to \theta_n(R, \vec x))$
of myopic sentences, which of course is equivalent to
$\forall \vec x (R\vec x \to (\theta_1(R, \vec x) \wedge \ldots \wedge \theta_n(R, \vec x)))$.

So, let $B'$ be a model satisfying $T$, and let $B$ be an $\omega$-big elementary
extension of $B'$, which exists by Theorem 23. We need to show that
$B \models \phi(R)$ (and, therefore, $B' \models \phi(R)$).

Now choose an arbitrary tuple $\vec b$ of elements such that $B \models R\vec b$, and let
$T'$ be the theory

$T' = \{R\vec a, \phi(R)\} \cup \{\psi(R, \vec a) : R \text{ occurs only negatively in } \psi,\ B \models \psi(R, \vec b)\}$,

where $\vec a$ is a tuple of new constant symbols.

$T'$ is satisfiable: indeed, if it were not then by compactness there would be
formulas $\psi_1(R, \vec x), \ldots, \psi_n(R, \vec x)$ in which $R$ occurs only negatively such that

$\phi(R) \models \forall \vec x \left(R\vec x \to \bigvee_{1 \le i \le n} \neg\psi_i(R, \vec x)\right)$.

But this is a myopic sentence entailed by $\phi(R)$, so it belongs to $T$ and
therefore would have to hold in $B$. This is a contradiction, since $B \models R\vec b$
and $B \models \psi_i(R, \vec b)$ for all $1 \le i \le n$.

Now let $(A, \vec a)$ be an $\omega$-saturated model of $T'$. If $R$ occurs only positively
in $\chi(R, \vec x)$ and $A \models \chi(R, \vec a)$, then $B \models \chi(R, \vec b)$; otherwise $\neg\chi(R, \vec a)$ would be
in $T'$. Furthermore, since $B$ is $\omega$-big, $(B, \vec b)$ is $\omega$-saturated by Theorem 25.
Thus, by Theorem 24, there are elementary substructures $(C, \vec a)$ and $(D, \vec b)$
of $(A, \vec a)$ and $(B, \vec b)$ and a bijective homomorphism $f : C \to D$ that fixes all
relations except $R$.

Let $S = f(R^C)$. Then $S \subseteq R^D$, since $f$ is a homomorphism; and $f$ is
actually an isomorphism between $(C, \vec a)$ and $(D[S/R], \vec b)$, since $f$ fixes even
$R$ between these two models. Now, $C \models R\vec a \wedge \phi(R)$, whence $D \models S\vec b \wedge \phi(S)$.
Furthermore, since $S \subseteq R^D$ we have that $D \models \forall \vec x (S\vec x \to R\vec x)$.

Now, $(D, \vec b)$ is an elementary substructure of $(B, \vec b)$ and $B$ is an $\omega$-big
model: therefore, there exists a relation $P$ over $B$ such that $(D, \vec b, S) \equiv
(B, \vec b, P)$. In particular, this implies that $B \models P\vec b \wedge \phi(P) \wedge P \subseteq R$: there is
a subset of $R^B$ which contains $\vec b$ and satisfies $\phi$.

But we chose $\vec b$ as an arbitrary tuple in $R^B$. So we have that $R^B$ is the
union of a family of relations $P_{\vec b}$, where $\vec b$ ranges over $R^B$; and $B \models \phi(P_{\vec b})$ for
all such $\vec b$. Since $\phi(R)$ is closed under unions, this implies that $B \models \phi(R)$, as
required.

5 An EF Game for Inclusion Logic 

We will now define an Ehrenfeucht-Fraïssé game for Inclusion Logic. This
game is an obvious variant of the one defined in [21] for Dependence Logic:

Definition 27 Let $A$ and $B$ be two models over the same signature, let $n \in
\mathbb{N}$, and let $X$ and $Y$ be two teams with the same domain over $A$ and $B$,
respectively. Then the two-player game $G_n(A, X, B, Y)$ is defined as follows:

1. The initial position $p_0$ is $(X, Y)$;

2. For each $i \in \{1, \ldots, n\}$, let $p_{i-1}$ be $(X_{i-1}, Y_{i-1})$. Then Spoiler makes a
move of one of the following types:

Splitting: Spoiler chooses two teams $X', X''$ such that $X_{i-1} = X' \cup X''$.
Then Duplicator chooses two teams $Y', Y''$ such that $Y_{i-1} = Y' \cup Y''$.
Then Spoiler chooses whether the next position $p_i$ is $(X', Y')$
or $(X'', Y'')$.

Supplementing: Spoiler chooses a variable $v$ and a function
$H : X_{i-1} \to \mathcal{P}(\mathrm{Dom}(A)) \setminus \{\emptyset\}$. Then Duplicator chooses a function
$K : Y_{i-1} \to \mathcal{P}(\mathrm{Dom}(B)) \setminus \{\emptyset\}$, and the new position $p_i$ is

$(X_{i-1}[H/v], Y_{i-1}[K/v])$.

Duplication: Spoiler chooses a variable $v$. The next position $p_i$ is

$(X_{i-1}[A/v], Y_{i-1}[B/v])$.

3. The final position $p_n = (X_n, Y_n)$ is winning for Spoiler if and only
if there exists a formula $\alpha$ which is either a first-order literal, or an
inclusion atom, such that $A \models_{X_n} \alpha$, but $B \not\models_{Y_n} \alpha$. Otherwise, the final
position is winning for Duplicator.
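
The three kinds of moves act on teams through simple set operations. The following Python sketch illustrates them under assumptions that are not fixed by the text above: an assignment is encoded as a sorted tuple of (variable, value) pairs, a team as a frozenset of such tuples, and the supplemented team $X[H/v]$ is taken to be $\{s[a/v] : s \in X, a \in H(s)\}$, as is usual in Team Semantics. All function names are chosen here for illustration.

```python
def update(s, v, a):
    """The assignment s[a/v]; assignments are sorted tuples of (variable, value) pairs."""
    return tuple(sorted({**dict(s), v: a}.items()))

def supplement(team, v, H):
    """X[H/v] = { s[a/v] : s in X, a in H(s) }, where every H(s) is nonempty."""
    return frozenset(update(s, v, a) for s in team for a in H(s))

def duplicate(team, v, domain):
    """X[A/v] = { s[a/v] : s in X, a in A }."""
    return frozenset(update(s, v, a) for s in team for a in domain)

def is_split(team, left, right):
    """A legal splitting move: the two parts cover the team (they may overlap)."""
    return frozenset(left) | frozenset(right) == frozenset(team)

A = {1, 2}
X0 = frozenset({()})                                   # team with only the empty assignment
X1 = duplicate(X0, "x", A)                             # x ranges over all of A
X2 = supplement(X1, "y", lambda s: {dict(s)["x"]})     # y copies the value of x
```

In this toy run `X2` consists of the two assignments with x = y, and splitting `X2` into its two singletons is a legal move.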

The rank of an Inclusion Logic formula is also defined in much the same way
as the rank of a Dependence Logic formula:

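Definition 28 Let $\phi$ be an FO($\subseteq$)-formula. Then we define its rank $\mathrm{rk}(\phi) \in
\mathbb{N}$ by structural induction on $\phi$, as follows:

1. If $\phi$ is a first-order literal or an inclusion atom, $\mathrm{rk}(\phi) = 0$;

2. $\mathrm{rk}(\psi \wedge \theta) = \max(\mathrm{rk}(\psi), \mathrm{rk}(\theta))$;

3. $\mathrm{rk}(\psi \vee \theta) = \max(\mathrm{rk}(\psi), \mathrm{rk}(\theta)) + 1$;

4. $\mathrm{rk}(\exists v \psi) = \mathrm{rk}(\forall v \psi) = \mathrm{rk}(\psi) + 1$.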
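
The four clauses of the rank definition translate directly into a recursive function. The sketch below assumes a hypothetical encoding of formulas as nested tuples whose first component is a tag (`"lit"`, `"inc"`, `"and"`, `"or"`, `"exists"`, `"forall"`); this encoding is introduced here only for illustration.

```python
def rk(phi):
    """Rank of a formula given as a nested tuple, following the four clauses above."""
    tag = phi[0]
    if tag in ("lit", "inc"):                 # first-order literal or inclusion atom
        return 0
    if tag == "and":                          # conjunction does not increase the rank
        return max(rk(phi[1]), rk(phi[2]))
    if tag == "or":                           # disjunction corresponds to a splitting move
        return max(rk(phi[1]), rk(phi[2])) + 1
    if tag in ("exists", "forall"):           # quantifiers: ("exists", variable, body)
        return rk(phi[2]) + 1
    raise ValueError(f"unknown connective: {tag!r}")

# forall x exists y (x ⊆ y  and  (x = y  or  x != y))
example = ("forall", "x",
           ("exists", "y",
            ("and", ("inc", "x", "y"),
                    ("or", ("lit", "x = y"), ("lit", "x != y")))))
example_rank = rk(example)
```

For the example formula the disjunction contributes 1, the conjunction contributes nothing, and each quantifier contributes 1, so its rank is 3. Each clause that adds 1 matches one kind of move in the game.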

The next theorem shows that our games behave as required with respect
to our notion of rank. Its proof is practically the same as for the EF game
for FO(D) in [21].

Theorem 29 Let $A$ and $B$ be models and $X$ and $Y$ teams on $A$ and $B$.
Then Duplicator has a winning strategy in $G_n(A, X, B, Y)$ if and only if

$A \models_X \phi \Longrightarrow B \models_Y \phi$

holds for all FO($\subseteq$)-formulas $\phi$ with $\mathrm{rk}(\phi) \le n$.

Due to the equivalence between FO($\subseteq$) and GFP$^+$ we can conclude at
once that the EF game for Inclusion Logic is also a novel EF game for GFP$^+$,
rather different in structure from the one introduced in [2]. It may be hoped
that this new game and its variants could be of some use for studying the
expressive power of fixed point logics.

Although the EF game for Inclusion Logic has a clear second order flavour, 
it is still manageable: we will next show that Duplicator has a concrete 
winning strategy, when the models are simple enough. 

Proposition 30 Let $A = \{1, \ldots, n\}$ and $B = \{1, \ldots, n + 1\}$ be two finite
models over the empty signature. Then for all FO($\subseteq$)-sentences $\phi$ of rank
$\le n$,

$A \models \phi \Longrightarrow B \models \phi$.

Proof. It suffices to specify a winning strategy for Duplicator in the game
$G_n(A, \{\emptyset\}, B, \{\emptyset\})$. Our aim for such a strategy is to preserve the following
property for $n$ turns:

• If the current position is $(X, Y)$ then

$Y = \bigcup \{\pi[X] : \pi \in I(A, B)\}$, (3)

where $I(A, B)$ is the set of all 1-1 functions $A \to B$, $\pi[X] = \{\pi(s) : s \in X\}$
and $\pi(s)$ denotes the assignment $\pi \circ s$.

The property (3) is trivially true for $(\{\emptyset\}, \{\emptyset\})$. Furthermore, as long as
(3) holds, Spoiler does not win. Indeed, if $\alpha$ is a first-order literal such that
$A \models_s \alpha$ for all $s \in X$, then, since all $s' \in Y$ are of the form $\pi(s)$ for some
$s \in X$ and some injective $\pi$, and the signature is empty, we have $B \models_{s'} \alpha$
for all $s' \in Y$. Similarly, suppose that $A \models_X u \subseteq w$, and let $s' \in Y$. Then
$s' = \pi(s)$ for some $s \in X$ and some $\pi \in I(A, B)$, and there exists an $h \in X$
such that $u(s) = w(h)$. But then $\pi(h) \in Y$, and $w(\pi(h)) = u(\pi(s)) = u(s')$,
as required.

Thus, we only need to verify that Duplicator can maintain property (3)
for $n$ rounds. Suppose that at round $i \le n$ the current position $(X, Y)$ has
property (3), and let us consider the possible moves of Spoiler:

Splitting: Suppose that Spoiler splits $X$ into $X_1$ and $X_2$. Then let Duplicator
reply by splitting $Y$ into
$Y_j = \{s' \in Y : \exists \pi \in I(A, B)\, \exists s \in X_j \text{ such that } \pi(s) = s'\}$
for $j \in \{1, 2\}$. Then $Y = Y_1 \cup Y_2$, and it
is straightforward to check that both possible successors $(X_1, Y_1)$ and
$(X_2, Y_2)$ have property (3).

Supplementing: Suppose that Spoiler chooses a function $H \in C(X, A)$.
Then let Duplicator reply with the function $K \in C(Y, B)$ defined as

$K(s') = \{\pi(a) : \exists \pi \in I(A, B)\, \exists s \in X \text{ such that } \pi(s) = s' \text{ and } a \in H(s)\}$

for each $s' \in Y$. We leave it to the reader to verify that the next
position $(X[H/v], Y[K/v])$ has property (3).

Duplication: If Spoiler chooses a duplication move, the next position is
$(X[A/v], Y[B/v])$. We check that this new position satisfies property
(3).

Let $s[a/v] \in X[A/v]$ and let $\pi \in I(A, B)$. Since $s \in X$, we have that
$\pi(s) \in Y$, and therefore $\pi(s)[b/v] = \pi(s[a/v]) \in Y[B/v]$, where $b = \pi(a)$.

Conversely, let $s' \in Y$ and let $b$ be any element of $B$. We need to show
that $s'[b/v] = \pi(s[a/v])$ for some $\pi \in I(A, B)$, $s \in X$ and $a \in \mathrm{Dom}(A)$.

By induction hypothesis, there exist $\pi \in I(A, B)$ and $s \in X$ such that
$\pi(s) = s'$. If $b$ is in the range of $\pi$, then $s'[b/v] = \pi(s[a/v])$, where
$a = \pi^{-1}(b)$. On the other hand, if $b$ is not in the range of $\pi$, then
since $i \le n$, there is an element $a \in A$ which is not in the range of $s$.
Now $s[a/v] \in X[A/v]$, and $s'[b/v] = \pi'(s[a/v])$, where $\pi' \in I(A, B)$ is a
function such that $\pi'(a) = b$ and $\pi'(c) = \pi(c)$ for all $c$ in the range of
$s$.
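
For small $n$ the duplication case of this argument can be checked mechanically. The Python sketch below is an illustration under assumptions made here (assignments as sorted tuples of (variable, value) pairs, teams as frozensets, hypothetical function names): for $n = 3$ it plays duplication moves only and tests after each one whether the position still satisfies property (3).

```python
from itertools import permutations

def injections(A, B):
    """All 1-1 functions A -> B, each represented as a dict."""
    A = sorted(A)
    return [dict(zip(A, image)) for image in permutations(sorted(B), len(A))]

def apply(pi, s):
    """The assignment pi o s; s is a sorted tuple of (variable, value) pairs."""
    return tuple((v, pi[a]) for v, a in s)

def image_union(X, A, B):
    """Right-hand side of (3): the union of pi[X] over all injections pi."""
    return frozenset(apply(pi, s) for pi in injections(A, B) for s in X)

def duplicate(team, v, domain):
    """The team X[A/v] = { s[a/v] : s in X, a in A }."""
    return frozenset(
        tuple(sorted({**dict(s), v: a}.items())) for s in team for a in domain
    )

n = 3
A, B = set(range(1, n + 1)), set(range(1, n + 2))
X, Y = frozenset({()}), frozenset({()})     # both teams contain only the empty assignment
initially_ok = image_union(X, A, B) == Y

preserved = []
for v in ["x", "y", "z", "w"]:              # n + 1 duplication moves in a row
    X, Y = duplicate(X, v, A), duplicate(Y, v, B)
    preserved.append(image_union(X, A, B) == Y)
```

In this instance property (3) survives the first three duplications and fails after the fourth, when the team over $B$ contains an assignment with four distinct values and no assignment over $A$ can be mapped onto it. This matches the role of the bound on the number of rounds in the proof; the check covers duplication moves only and is not a substitute for the argument above.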

From Proposition 30 it immediately follows that even cardinality (and other
similar cardinality properties) of finite models is not definable in Inclusion
Logic. This, of course, follows already from the equivalence of FO($\subseteq$) and
GFP$^+$, as it is well-known that non-trivial cardinality properties are not
definable in fixed point logics.

6 Conclusions and Further Work

In this work, we proved a number of results concerning the expressive power
of Inclusion Logic. We showed that this logic is strictly weaker than $\Sigma_1^1$, and
corresponds in fact to Positive Greatest Fixed Point Logic. Furthermore, we
showed that all union-closed first-order properties of relations correspond to
the satisfaction conditions of Inclusion Logic formulas, and we also defined
a new Ehrenfeucht-Fraïssé game for it.

Due to the connection between Inclusion Logic and fixed point logics,
the study of this formalism may have interesting applications in descriptive
complexity theory. In [5], Durand and Kontinen established some
correspondences between fragments of Dependence Logic and fragments of NP; in
the same way, one may hope to find correspondences between fragments of
Inclusion Logic and fragments of PTIME.

Furthermore, we may inquire about extensions of Inclusion Logic. For
example, is there any natural union-closed dependency notion $D$ such that
FO($\subseteq$, $D$) defines all $\Sigma_1^1$ union-closed properties of relations? By the results
in Section 4 we know that if this is the case, then $D$ is not first-order.